2 research outputs found

YUVMultiNet: Real-time YUV multi-task CNN for autonomous driving

Authors: Boulay Thomas; El-Hachimi Said; Kandan Saranya; Maddu Pullarao; Surisetti Mani Kumar
Publication date: 11/04/2019

In this paper, we propose a multi-task convolutional neural network (CNN) architecture optimized for a low-power automotive-grade SoC. We introduce a network based on a unified architecture in which the encoder is shared between the two tasks, namely detection and segmentation. The proposed network runs at 25 FPS at 1280x800 resolution. We briefly discuss the methods used to optimize the network architecture, such as using native YUV images directly, optimizing layers and feature maps, and applying quantization. We also focus on memory bandwidth in our design, as convolutions are data intensive and most SoCs are bandwidth bottlenecked. We then demonstrate the efficiency of our proposed network for a dedicated CNN accelerator, presenting the key performance indicators (KPI) for the detection and segmentation tasks obtained from the hardware execution and the corresponding run-time.

Comment: This paper is accepted for CVPR workshop dem

arXiv.org e-Print Archive

Design of Real-time Semantic Segmentation Decoder for Automated Driving

Authors: Das Arindam; Kandan Saranya; Krizek Pavel; Yogamani Senthil
Publication date: 19/01/2019

Semantic segmentation remains a computationally intensive algorithm for embedded deployment, even with the rapid growth of computation power. Thus, efficient network design is critical, especially for applications like automated driving that require real-time performance. Recently, there has been a lot of research on designing efficient encoders that are mostly task agnostic. Unlike in image classification and bounding-box object detection, decoders are also computationally expensive in semantic segmentation. In this work, we focus on the efficient design of the segmentation decoder and assume that an efficient encoder has already been designed to provide shared features for a multi-task learning system. We design a novel, efficient non-bottleneck layer and a family of decoders that fit into a small run-time budget, using VGG10 as an efficient encoder. We demonstrate on our dataset that experimentation with various design choices led to an improvement of 10% over baseline performance.

Comment: Accepted at VISAPP 201

arXiv.org e-Print Archive